perm filename A54.TEX[106,PHY] blob sn#848176 filedate 1987-11-04 generic text, type C, neo UTF8
\magnification\magstephalf
\input macro.tex
\def\today{\ifcase\month\or
  January\or February\or March\or April\or May\or June\or
  July\or August\or September\or October\or November\or December\fi
  \space\number\day, \number\year}
\baselineskip 14pt
\rm
\line{\sevenrm a54.tex[106,phy] \today\hfill}

\bigskip
\line{{\bf Do-It-Yourself Input/Output.} (Assumes assignment, {\tt CHAR}, overflow)
\hfill}

Reading and printing of decimal numbers is not a primitive operation on most
computers, so Pascal provides standard subalgorithms in the {\tt READ}
and {\tt WRITE} procedures to handle most of the
common situations that arise.  Now and then, however, a problem comes up
where the built-in mechanisms are inadequate, and the programmer must
explicitly handle the individual decimal digits of numbers moving to or
from files.  Suppose we have a number~{\tt N}, known to be greater than 100,000
and less than 1,000,000, which we want to print punctuated with a comma
after the thousands digit, like `123,456'.  The {\tt WRITE} statement for numbers
does not provide for this form, so the program will have to execute
{\tt WRITE(',')}, preceded by something that writes out (say) 123, and followed by
something that writes 456.

We can calculate the numbers to be printed before and after the comma as
{\tt N DIV 1000} and {\tt N MOD 1000} respectively, and if {\tt N}
is 123456, the statement
{\tt WRITE(N DIV 1000:3,',',N MOD 1000:3)} prints

{\obeylines\obeyspaces\let =\ \tt
        123,456
}

\noindent
as desired.  If {\tt N} is 123004, however, we get an unpleasant surprise:

{\obeylines\obeyspaces\let =\ \tt
        123,  4
}

\noindent
because the {\tt WRITE} statement never prints leading zeroes.  For the part of the
number that follows the comma, at least, we must work out the individual
digits, and print them separately.  Getting the tens digit takes two
steps.  Taking {\tt N MOD 100}
gives us the numerical value of the two right-hand
digits (56~and 4 in the examples above), after which a division by~10 gives
the tens digit.  This subprogram prints a six-digit integer {\tt N} without
suppressing leading zeroes:

{\obeylines\obeyspaces\let =\ \tt
        D:=1000000;
        FOR I:=1 TO 6 DO
            BEGIN
            D:=D DIV 10; (* FIRST TIME 100000, LAST TIME 1 *)
            Q:=N DIV D; (* NEXT DIGIT OF N *)
            WRITE(Q:1);
            N:=N MOD D
            END
}

\noindent
To make it print a comma after the thousands digit, insert
{\tt IF I=3 THEN WRITE(',');} after the {\tt WRITE} statement.
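
\smallskip
With that insertion made, the whole subprogram might read as follows.  This is only a sketch; as before, {\tt N}, {\tt D}, {\tt Q} and~{\tt I} are assumed to be declared as integer variables.

{\obeylines\obeyspaces\let =\ \tt
        D:=1000000;
        FOR I:=1 TO 6 DO
            BEGIN
            D:=D DIV 10; (* FIRST TIME 100000, LAST TIME 1 *)
            Q:=N DIV D; (* NEXT DIGIT OF N *)
            WRITE(Q:1);
            IF I=3 THEN WRITE(','); (* AFTER THE THOUSANDS DIGIT *)
            N:=N MOD D
            END
}

\noindent
If {\tt N} is 123004, the successive values of {\tt Q} are 1, 2, 3, 0, 0, 4, and the output is `123,004'.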

\vfill\eject

{\rmn
{\narrower\smallskip\noindent
{\bf Exercise.} 
If we don't know that {\tt N} is greater than 100,000, a more complicated
algorithm is needed.  Write a program that will print {\tt N} in one of these
forms:

{\obeylines\obeyspaces\let =\ \tt
        1,234,567
        123,456
        12,345
        1,234
        123
        12
        1
        0
}

\smallskip\noindent
(Solution:  make {\tt N=0} a special case, otherwise remember 
whether there has been a
non-zero digit, suppressing all printing until a non-zero digit has been
encountered.)
\smallskip}
}
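
\smallskip
One way to carry out the solution outlined in the exercise above is sketched here.  It assumes that {\tt N} lies between 0 and 9,999,999, that {\tt MAXINT} is at least 10,000,000, and that {\tt SEEN} (a name introduced only for this sketch) is a Boolean variable.

{\obeylines\obeyspaces\let =\ \tt
        IF N=0 THEN WRITE('0')
        ELSE
            BEGIN
            D:=10000000;
            SEEN:=FALSE; (* NO NON-ZERO DIGIT MET YET *)
            FOR I:=1 TO 7 DO
                BEGIN
                D:=D DIV 10; (* FIRST TIME 1000000, LAST TIME 1 *)
                Q:=N DIV D; (* NEXT DIGIT OF N *)
                IF Q<>0 THEN SEEN:=TRUE;
                IF SEEN THEN
                    BEGIN
                    WRITE(Q:1);
                    IF (I=1) OR (I=4) THEN WRITE(',')
                    END;
                N:=N MOD D
                END
            END
}

\noindent
The commas follow the millions digit ({\tt I=1}) and the thousands digit ({\tt I=4}), and like the digits they are printed only after a non-zero digit has appeared, so that 1234 comes out as `1,234' and 123 as `123'.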

Suppose we want to read integers containing commas, ignoring the commas,
and stopping at the first blank space.  Reading into an integer variable
can't handle the task, so we have to read individual characters, building
up the value of the number from the values of the individual characters.
While the ordinal values of the digit characters are not the same as their
numerical values (this is because the digits are not the first characters
in standard alphabetical order), they are consecutive, so we can read a
digit into a {\tt CHAR} variable and get its numerical value {\tt DIGIT} by 

\smallskip\halign{\qq\qq\lft{\tt #}\cr
	READ (C);\cr
	DIGIT:=ORD(C)-ORD('0').\cr}
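
\smallskip
As a worked instance, take the ASCII character set (an assumption; Pascal itself does not fix the character set).  There {\tt ORD('0')} is 48 and {\tt ORD('7')} is 55, so reading the character {\tt 7} gives
$$\hbox{\tt DIGIT}=55-48=7.$$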

\smallskip
We get the value of a multi-digit number like 123456 from the values of its
individual digits by treating it as a polynomial
$$1\times 10^5 + 2\times 10^4 + 3\times 10^3 + 4\times 10^2 + 5
\times 10 + 6\,,$$
or equivalently
$$\bigl(\bigl(\bigl((1\times 10+2)\times 10+3\bigr)\times 10+4\bigr)
\times 10+5\bigr)\times 10+6\,.$$
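Evaluating the nested form from the inside out takes one multiplication and one addition per digit, and each intermediate result is just the number formed by the digits seen so far:
$$\eqalign{1\times 10+2&=12,\cr
12\times 10+3&=123,\cr
123\times 10+4&=1234,\cr
1234\times 10+5&=12345,\cr
12345\times 10+6&=123456.\cr}$$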

Now the plan of the algorithm becomes clear.  We read the characters of the
number, discarding commas, and keeping track of the numerical value of the
part of the number so far seen:

{\obeylines\obeyspaces\let =\ \tt
        N:=0;
        READ(C);
        WHILE C <> ' ' DO
            BEGIN
            IF C <> ',' THEN
                N:=N*10+ORD(C)-ORD('0');
            READ(C);
            END;

        (* N CONTAINS THE NUMBER READ *)
}

This program is not fussy about what it reads.  If the first input character is
blank, it treats that as a way of writing~0.  If the input string is {\tt ,,35,0},
it treats that as a way of writing 350.  A better program would watch for such
inputs as possible errors.

{\rmn
{\narrower\smallskip\noindent
{\bf Exercise.} 
The subprogram above fails if the number read is close to the maximum integer
value {\tt MAXINT}.  Revise it so it can read any positive integer up to 
{\tt MAXINT}.
[Use {\tt N:=N*10+(ORD(C)-ORD('0'))}.]

\smallskip\noindent
{\bf Exercise.} 
Revise it to detect an attempt to read a number larger than {\tt MAXINT}.  Why is

{\obeylines\obeyspaces\let =\ \tt
        IF N>MAXINT THEN WRITE ('TOO BIG')
}

\noindent
not a good approach?

\smallskip\noindent
{\bf Answer.} The assignment to~{\tt N} will overflow if
{\tt N*10+(ORD(C)-ORD('0'))>MAXINT}, that is, if
{\tt MAXINT-ORD(C)+ORD('0')<N*10}, or equivalently if
{\tt (MAXINT-ORD(C)+ORD('0')) DIV 10<N}. The last condition can safely
be tested without overflow of intermediate results. It does no good to
test whether {\tt N} has already overflowed, because the program fails
before reaching the test.
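
\smallskip\noindent
A reading loop that makes the safe test before each assignment might look like the following sketch.  The Boolean variable {\tt OK} is a name introduced only for this illustration, and every character other than a comma or a blank is assumed to be a digit.

{\obeylines\obeyspaces\let =\ \tt
        N:=0;
        OK:=TRUE; (* NO OVERFLOW DETECTED YET *)
        READ(C);
        WHILE C <> ' ' DO
            BEGIN
            IF C <> ',' THEN
                IF OK THEN
                    IF (MAXINT-ORD(C)+ORD('0')) DIV 10 < N THEN
                        OK:=FALSE (* NEXT DIGIT WOULD OVERFLOW *)
                    ELSE
                        N:=N*10+(ORD(C)-ORD('0'));
            READ(C)
            END;
        IF NOT OK THEN WRITE('TOO BIG')
}

\noindent
Once {\tt OK} becomes false the loop goes on reading up to the blank but no longer changes {\tt N}, so no assignment can overflow.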

\smallskip
\noindent
{\bf Exercise.} 
Revise the reading algorithm to read integers written in base~16, with the
letters {\tt A} through {\tt F} serving as digits with values 10 through~15.
\smallskip}
}
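
\smallskip
For the base-16 exercise above, one possible shape of the revised loop is sketched here.  It assumes that the input contains only the digits {\tt 0} through {\tt 9}, the capital letters {\tt A} through {\tt F}, commas, and the final blank, and that the letters {\tt A} through {\tt F} have consecutive ordinal values (true of ASCII and EBCDIC, though not guaranteed by Pascal).  It makes no check for overflow.

{\obeylines\obeyspaces\let =\ \tt
        N:=0;
        READ(C);
        WHILE C <> ' ' DO
            BEGIN
            IF C <> ',' THEN
                BEGIN
                IF (C >= '0') AND (C <= '9') THEN
                    DIGIT:=ORD(C)-ORD('0')
                ELSE
                    DIGIT:=ORD(C)-ORD('A')+10; (* A IS 10, F IS 15 *)
                N:=N*16+DIGIT
                END;
            READ(C)
            END
}

\noindent
Reading {\tt 1F} followed by a blank leaves {\tt N} equal to $1\times 16+15=31$.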


\bigskip
\parindent0pt
\copyright 1984 Robert W. Floyd

First draft March 28, 1984

\bye